VGG-16 model
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.48)
Road Segmentation for ADAS/AD Applications
Ramasamy, Mathanesh Vellingiri, Kurniasalim, Dimas Rizky
Accurate road segmentation is essential for autonomous driving and ADAS, enabling effective navigation in complex environments. This study examines how model architecture and dataset choice affect segmentation by training a modified VGG-16 on the Comma10k dataset and a modified U-Net on the KITTI Road dataset. Both models achieved high accuracy, with cross-dataset testing showing VGG-16 outperforming U-Net despite U-Net being trained for more epochs. We analyze model performance using metrics such as F1-score, mIoU, and precision, discussing how architecture and dataset impact results. Road image segmentation plays a crucial role in applications such as autonomous driving (AD), advanced driver assistance systems (ADAS), traffic monitoring, and smart city development.
- Europe > Sweden > Vaestra Goetaland > Gothenburg (0.05)
- North America > United States > California (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- Automobiles & Trucks (0.75)
- Transportation > Ground > Road (0.55)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.70)
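The abstract above reports F1-score, mIoU, and precision. As a purely illustrative sketch (not code from the paper; the function name and toy masks are invented), per-class IoU and F1 over integer label maps are typically computed like this:

```python
import numpy as np

def segmentation_metrics(pred, target, num_classes):
    """Per-class IoU and F1 (Dice) over integer label maps,
    averaged over the classes present in either map."""
    ious, f1s = [], []
    for c in range(num_classes):
        p, t = pred == c, target == c
        inter = np.logical_and(p, t).sum()
        union = np.logical_or(p, t).sum()
        if union == 0:                 # class absent everywhere: skip it
            continue
        ious.append(inter / union)
        f1s.append(2 * inter / (p.sum() + t.sum()))
    return float(np.mean(ious)), float(np.mean(f1s))

# Toy 2x2 "road" masks (1 = road, 0 = background)
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
miou, f1 = segmentation_metrics(pred, target, num_classes=2)
```

For binary road segmentation the F1 of the road class equals the Dice coefficient, which is why the two names are often used interchangeably in this literature.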
Impact of Batch Normalization on Convolutional Network Representations
Potgieter, Hermanus L., Mouton, Coenraad, Davel, Marelie H.
Deep learning has become a particularly important set of machine learning techniques and is widely applied to solve real-world tasks. At the same time, many open questions remain with regard to the ability of these deep neural networks (DNNs) to generalize so well, that is, their ability to perform well on unseen data. Although there is not yet a theoretical framework to assist us in reasoning about these models [2], the generalization ability of DNNs has been studied from many perspectives, such as the geometry of the loss landscape [3], statistical measures of stability and robustness [4], size of margins (distance to the decision boundary between classes) [5], and information-theoretic techniques [6], among others. A promising research direction is to study the characteristics of the internal data representations formed by DNNs, where each representation is the vector of activation values from a specific layer for a given sample. Aspects of these representations that have been studied include the size of margins in the representation space [7, 8, 9]; the 'quality' of representations, evaluated using the consistency of class-specific representations and their robustness when combined [9]; and representation sparsity, that is, the number of non-zero elements in a data representation [10]. In this work, we also study the characteristics of the internal representations of DNNs, but focus on the effect that a very specific technique - Batch Normalization (BatchNorm) - has on internal representation quality. BatchNorm [11] is a popular technique used to normalize hidden activations when training DNNs. Networks trained with BatchNorm show desirable properties such as faster convergence and better generalization ability [12, 13]. Despite the success and widespread adoption of BatchNorm, the exact mechanisms by which BatchNorm achieves its performance remain unclear.
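As a reference point for the abstract above, the core BatchNorm operation normalizes each feature over the batch and then applies a learned affine map. A minimal NumPy sketch of the standard training-time forward pass (inference uses running statistics instead; all names here are illustrative, not from the paper):

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Training-time BatchNorm: normalize each feature over the batch
    axis, then apply the learned scale (gamma) and shift (beta)."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)   # zero mean, ~unit variance
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(64, 8))   # batch of 64, 8 features
y = batchnorm_forward(x, gamma=np.ones(8), beta=np.zeros(8))
```

With gamma fixed at 1 and beta at 0, each output feature has near-zero mean and near-unit variance over the batch; the learned affine parameters let the network undo this normalization where useful, which is part of why BatchNorm's effect on internal representations is non-trivial to characterize.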
Leveraging counterfactual concepts for debugging and improving CNN model performance
Counterfactual explanation methods have recently received significant attention for explaining CNN-based image classifiers due to their ability to provide easily understandable explanations that align more closely with human reasoning. However, limited attention has been given to utilizing explainability methods to improve model performance. In this paper, we propose to leverage counterfactual concepts aiming to enhance the performance of CNN models in image classification tasks. Our proposed approach utilizes counterfactual reasoning to identify crucial filters used in the decision-making process. Following this, we perform model retraining through the design of a novel methodology and loss functions that encourage the activation of class-relevant important filters and discourage the activation of irrelevant filters for each class. This process effectively minimizes the deviation of activation patterns of local predictions and the global activation patterns of their respective inferred classes. By incorporating counterfactual explanations, we validate unseen model predictions and identify misclassifications. The proposed methodology provides insights into potential weaknesses and biases in the model's learning process, enabling targeted improvements and enhanced performance. Experimental results on publicly available datasets have demonstrated an improvement of 1-2%, validating the effectiveness of the approach.
- Asia > Pakistan > Islamabad Capital Territory > Islamabad (0.05)
- North America > United States > California (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
PPG Signals for Hypertension Diagnosis: A Novel Method using Deep Learning Models
Frederick, Graham, T, Yaswant, A, Brintha Therese
Hypertension is a medical condition characterized by high blood pressure, and classifying it into its various stages is crucial to managing the disease. In this project, a novel method is proposed for classifying stages of hypertension using Photoplethysmography (PPG) signals and deep learning models, namely AvgPool_VGG-16. The PPG signal is a non-invasive method of measuring blood pressure through the use of light sensors that measure the changes in blood volume in the microvasculature of tissues. PPG images from the publicly available blood pressure classification dataset were used to train the model. Multiclass classification of the various hypertension stages was performed. The results show that the proposed method achieves high accuracy in classifying hypertension stages, demonstrating the potential of PPG signals and deep learning models in hypertension diagnosis and management.
- Asia > India > Tamil Nadu > Chennai (0.05)
- Europe > Switzerland > Basel-City > Basel (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- Research Report > Promising Solution (1.00)
- Research Report > New Finding (0.88)
Deep learning for AI-based diagnosis of skin-related neglected tropical diseases: a pilot study
Background: Deep learning, a part of the broader field of artificial intelligence (AI) and machine learning, has achieved remarkable success in vision tasks. While there is growing interest in using this technology for diagnostic support of skin-related neglected tropical diseases (skin NTDs), studies in this area remain limited, and fewer still have focused on dark skin. In this study, we aimed to develop deep learning-based AI models with clinical images we collected for five skin NTDs, namely Buruli ulcer, leprosy, mycetoma, scabies, and yaws, to understand how diagnostic accuracy can or cannot be improved using different models and training patterns. Methodology: This study used photographs collected prospectively in Côte d'Ivoire and Ghana through our ongoing studies on the use of digital health tools for clinical data documentation and teledermatology. Our dataset included a total of 1,709 images from 506 patients.
- Africa > Ghana (0.27)
- Africa > Côte d'Ivoire (0.27)
- North America > United States (0.17)
- Asia > Japan (0.16)
Multi-input segmentation of damaged brain in acute ischemic stroke patients using slow fusion with skip connection
Tomasetti, Luca, Khanmohammadi, Mahdieh, Engan, Kjersti, Høllesli, Liv Jorunn, Kurz, Kathinka Dæhli
Time is a fundamental factor in stroke treatment. A fast, automatic approach that segments the ischemic regions supports treatment decisions. In clinical use today, a set of color-coded parametric maps generated from computed tomography perfusion (CTP) images is inspected manually to decide a treatment plan. We propose an automatic method based on a neural network that uses a set of parametric maps to segment the two ischemic regions (core and penumbra) in patients affected by acute ischemic stroke. Our model is based on a convolution-deconvolution bottleneck structure with multi-input and slow fusion. A loss function based on the focal Tversky index addresses the data imbalance issue. The proposed architecture demonstrates effective performance, with results comparable to the ground truth annotated by neuroradiologists: a Dice coefficient of 0.81 for penumbra and 0.52 for core over the large vessel occlusion test set. The full implementation is available at: https://git.io/JtFGb.
- Europe > Norway > Western Norway > Rogaland > Stavanger (0.05)
- Europe > Norway > Northern Norway > Troms > Tromsø (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Hematology (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
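The focal Tversky loss mentioned in the abstract above follows a standard formulation: a Tversky index that weights false negatives and false positives asymmetrically, raised to a focusing exponent. A minimal sketch with commonly used default hyperparameters (the paper's exact values are not given in the abstract):

```python
import numpy as np

def focal_tversky_loss(pred, target, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-7):
    """Focal Tversky loss for a soft binary segmentation mask.
    alpha penalizes false negatives, beta false positives; gamma < 1
    shifts the focus toward hard, poorly segmented examples. The
    defaults here are common choices, not necessarily the paper's."""
    tp = (pred * target).sum()
    fn = ((1.0 - pred) * target).sum()
    fp = (pred * (1.0 - target)).sum()
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return (1.0 - tversky) ** gamma
```

Setting alpha > beta makes missed lesion pixels (false negatives) costlier than false alarms, which is the usual remedy when the ischemic regions occupy only a small fraction of each slice.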
Content Masked Loss: Human-Like Brush Stroke Planning in a Reinforcement Learning Painting Agent
Schaldenbrand, Peter, Oh, Jean
The objective of most Reinforcement Learning painting agents is to minimize the loss between a target image and the paint canvas. Human painter artistry emphasizes important features of the target image rather than simply reproducing it (DiPaola 2007). Although RL painting models trained with adversarial or L2 losses generally produce a polished final output, their stroke sequences differ vastly from those a human would produce, since the model has no knowledge of the abstract features in the target image. To increase the human-like planning of the model without the use of expensive human data, we introduce a new loss function for use with the model's reward function: Content Masked Loss. In the context of robot painting, Content Masked Loss employs an object detection model to extract features, which are used to assign higher weight to regions of the canvas that a human would find important for recognizing content. The results, based on 332 human evaluators, show that the digital paintings produced by our Content Masked model reveal detectable subject matter earlier in the stroke sequence than existing methods, without compromising the quality of the final painting.
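The weighting idea behind Content Masked Loss can be sketched as a per-pixel L2 loss modulated by an importance map. In the paper that map comes from an object-detection model's features; this illustrative sketch simply takes the map as an input (function and variable names are invented for the example):

```python
import numpy as np

def content_masked_loss(canvas, target, importance):
    """Per-pixel squared error weighted by an importance map, so that
    errors in content-relevant regions dominate the loss."""
    w = importance / (importance.sum() + 1e-8)   # normalize weights
    return float((w * (canvas - target) ** 2).sum())
```

Under such a loss, a stroke that fixes an error on the subject reduces the reward-relevant loss far more than an equal-sized fix in the background, which is what pushes the agent to paint recognizable content first.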
How Compact?: Assessing Compactness of Representations through Layer-Wise Pruning
Jung, Hyun-Joo, Kim, Jaedeok, Choe, Yoonsuck
Various forms of representations may arise in the many layers embedded in deep neural networks (DNNs). Of these, where can we find the most compact representation? We propose to use a pruning framework to answer this question: How compact can each layer be compressed, without losing performance? Most of the existing DNN compression methods do not consider the relative compressibility of the individual layers. They uniformly apply a single target sparsity to all layers or adapt layer sparsity using heuristics and additional training. We propose a principled method that automatically determines the sparsity of individual layers derived from the importance of each layer. To do this, we consider a metric to measure the importance of each layer based on the layer-wise capacity. Given the trained model and the total target sparsity, we first evaluate the importance of each layer from the model. From the evaluated importance, we compute the layer-wise sparsity of each layer. The proposed method can be applied to any DNN architecture and can be combined with any pruning method that takes the total target sparsity as a parameter. To validate the proposed method, we carried out an image classification task with two types of DNN architectures on two benchmark datasets and used three pruning methods for compression. In the case of the VGG-16 model with weight pruning on the ImageNet dataset, we achieved up to 75% (17.5% on average) better top-5 accuracy than the baseline under the same total target sparsity. Furthermore, we analyzed where the maximum compression can occur in the network. This kind of analysis can help us identify the most compact representation within a deep neural network.
- North America > United States > Texas > Brazos County > College Station (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
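The paper above derives each layer's sparsity from a layer-importance metric; that derivation cannot be reconstructed from the abstract. Once a per-layer sparsity is known, however, applying it via standard magnitude-based weight pruning looks roughly like this (an illustrative sketch; the function name and toy weights are invented):

```python
import numpy as np

def prune_layer(weights, sparsity):
    """Zero out the smallest-magnitude weights so that `sparsity`
    (a fraction in [0, 1]) of the layer's weights become zero."""
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning cutoff
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold   # keep only weights above it
    return weights * mask

rng = np.random.default_rng(1)
w = rng.normal(size=(4, 4))
w_pruned = prune_layer(w, sparsity=0.5)  # half of the 16 weights zeroed
```

Any such per-layer routine plugs into the paper's framework as the "pruning method that takes the total target sparsity as a parameter": the framework decides how much to prune each layer, and the routine decides which weights to drop.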